Define TR caching comms in ATD by aryx · Pull Request #353 · semgrep/semgrep-interfaces

aryx · 2025-02-19T13:48:13Z

This PR makes the interface we want with the backend more concrete.

test plan:
make

- [x] I ran `make setup && make` to update the generated code after editing a `.atd` file (TODO: have a CI check)
- [x] I made sure we're still backward compatible with old versions of the CLI.
  For example, the Semgrep backend still needs to be able to *consume* data
  generated by Semgrep 1.50.0.
  See https://atd.readthedocs.io/en/latest/atdgen-tutorial.html#smooth-protocol-upgrades
  Note that the types related to the semgrep-core JSON output or the
  semgrep-core RPC do not need to be backward compatible!
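
The backward-compatibility requirement above means a newer consumer has to accept payloads produced by an older CLI that does not know about newly added fields. The following is a minimal sketch of that idea in plain Python, not the generated code from this repository: the `ScanInfo` record and its `cache_hint` field are hypothetical names chosen only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanInfo:
    """Hypothetical record; NOT a type from semgrep_output_v1.atd."""

    version: str
    # Hypothetical field added in a later release. It is optional with a
    # default so that payloads from older producers, which omit it, still parse.
    cache_hint: Optional[str] = None


def scan_info_from_json(text: str) -> ScanInfo:
    raw = json.loads(text)
    # Required field: present in every version of the payload.
    version = raw["version"]
    # New field: .get() returns None when an older producer left it out.
    return ScanInfo(version=version, cache_hint=raw.get("cache_hint"))


# A payload shaped like what an older CLI might emit (no new field) ...
old_payload = '{"version": "1.50.0"}'
# ... and one from a newer CLI that includes it.
new_payload = '{"version": "1.111.0", "cache_hint": "abc"}'

old_info = scan_info_from_json(old_payload)
new_info = scan_info_from_json(new_payload)
```

Adding a new required field instead would make `old_payload` fail to parse, which is the kind of break the backward-compatibility check is meant to catch.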

github-actions · 2025-02-19T13:49:20Z

Backwards compatibility summary:

Checking backward compatibility of semgrep_output_v1.atd against past version v1.100.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.101.0
Skipping v1.102.0 because commit 1c82453e89e0b569630e48ddde015e201df0e5f9 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.103.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.104.0
Skipping v1.106.0 because commit 5e0c767ec323f3f2356d3bf8dbdf7c7836497d8a has already been checked
Skipping v1.107.0 because commit 5e0c767ec323f3f2356d3bf8dbdf7c7836497d8a has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.108.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.109.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.110.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.111.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.75.0
Skipping v1.76.0 because commit 9102031608aa4154e1c37f557550ec4eabc8780c has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.77.0
Skipping v1.78.0 because commit dcb5d77b420ddee61f58aadd3c2c7aef38778154 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.79.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.80.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.81.0
Skipping v1.82.0 because commit 9e0f3bec26b07b4fb6753a32cb75277f45f2572c has already been checked
Skipping v1.83.0 because commit 9e0f3bec26b07b4fb6753a32cb75277f45f2572c has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.84.0
Skipping v1.84.1 because commit 3daef49297ada205359cc1d2996354c94b628b0d has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.85.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.86.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.87.0
Skipping v1.88.0 because commit 512c0bd97db59c48a5705b2741662a338776e438 has already been checked
Skipping v1.89.0 because commit 512c0bd97db59c48a5705b2741662a338776e438 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.90.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.91.0
Skipping v1.92.0 because commit 2351c5e528cb7430422208dc66707894c066b508 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.93.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.94.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.95.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.96.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.97.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.98.0
Skipping v1.99.0 because commit 60809032a2e39742f42910d46b3e5dd305b8b8cf has already been checked

bkettle · 2025-02-20T01:29:17Z

semgrep_output_v1.atd

+type tr_cache_key = {
+    rule_id: rule_id;
+    (* ex: http://some-website/hello-world.0.1.2.tgz like in found_dependency *)
+    resolved_url: string;


I don't know if this will be easy to find in many cases, though I agree that if it's possible it makes the best key. I guess it is probably fine to start with this, and add another key later if needed.

aryx · 2025-02-20

Yes, we can always refine it. This is just defining the interface; once we start the implementation we will probably discover that it needs refining.

gautambhat · 2025-02-27

Any thoughts on, or experience with, PackageURL?

I'm also curious how we determine the URL to fetch the code for a package when our dependency scanning logic doesn't yield a resolved_url.
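
To illustrate the key discussed in this thread, here is a minimal Python sketch of a `(rule_id, resolved_url)` pair used as a cache key. It mirrors the two fields visible in the `tr_cache_key` diff above; the class name `TrCacheKey`, the in-memory dict, the rule ID, and the cached value are assumptions made for illustration, not the backend's actual storage.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)  # frozen makes instances hashable, so usable as dict keys
class TrCacheKey:
    rule_id: str
    # ex: http://some-website/hello-world.0.1.2.tgz like in found_dependency
    resolved_url: str


# Hypothetical in-memory cache; the cached value is a placeholder string.
cache: Dict[TrCacheKey, str] = {}

key = TrCacheKey(
    rule_id="hypothetical.rule-id",
    resolved_url="http://some-website/hello-world.0.1.2.tgz",
)
cache[key] = "cached-result-placeholder"

# A key built independently from the same fields hits the same entry.
same_key = TrCacheKey(
    rule_id="hypothetical.rule-id",
    resolved_url="http://some-website/hello-world.0.1.2.tgz",
)
hit = cache.get(same_key)

# A different package version is a different key, hence a miss.
other_key = TrCacheKey(
    rule_id="hypothetical.rule-id",
    resolved_url="http://some-website/hello-world.0.1.3.tgz",
)
miss = cache.get(other_key)
```

Because equality is purely structural, the key is only as good as the `resolved_url` string: two URLs that point at the same artifact but differ textually would be cached separately, which is one reason a different key might be needed later.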

bkettle · 2025-02-20T01:31:18Z

semgrep_output_v1.atd

+     * and [transitive_unreachable] records?
+     * TODO? make it a list? match_results: ... list; ?
+     *)
+    match_result: sca_match_kind;


This is a bit odd, since when scanning a package with a rule it will result in direct code matches, not sca matches. Maybe we should just return the match here? Then, the CLI could treat those matches the same as matches that it receives from a call to Semgrep locally.

aryx · 2025-02-20

We could. That is a bigger data structure to store, though, and for TR what we really need is just the sca_transitive_match_kind; that is the thing we are trying to optimize, to avoid downloading the dependency and running semgrep on it.

bkettle · 2025-02-21

I think we should at least store the match locations. I don't think it makes sense to store sca_match_kind, because if there are matches in multiple packages we will somehow need to combine them into a single finding in the CLI.
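
The trade-off in this thread can be sketched as follows. This is a rough Python illustration under stated assumptions, not the interface that was merged: `MatchLocation`, `lookup_or_scan`, and `combine` are hypothetical names, and the cache is a plain dict. It shows a cache that stores match locations per `(rule_id, resolved_url)` key, skips the expensive download-and-scan step on a hit, and merges the results from several packages into one list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class MatchLocation:
    """Hypothetical stand-in for a code match location."""

    path: str
    line: int


# (rule_id, resolved_url), as in the cache key discussed earlier.
CacheKey = Tuple[str, str]


def lookup_or_scan(
    cache: Dict[CacheKey, List[MatchLocation]],
    key: CacheKey,
    scan: Callable[[CacheKey], List[MatchLocation]],
) -> List[MatchLocation]:
    """Return cached locations if present; otherwise scan and cache the result."""
    cached = cache.get(key)
    if cached is not None:  # an empty list is a valid cached result
        return cached
    result = scan(key)  # the expensive step: download the package and scan it
    cache[key] = result
    return result


def combine(per_package: List[List[MatchLocation]]) -> List[MatchLocation]:
    """Flatten the locations from several packages into a single list."""
    return [loc for locations in per_package for loc in locations]
```

Storing locations rather than a single match kind keeps enough information for the caller to decide how to build one finding out of matches in several packages, at the cost of a larger cached value.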

bkettle

lgtm

Define TR caching comms in ATD

82c8b8d

aryx requested review from a team, aaronmichaelacosta, bkettle and emjin and removed request for a team and emjin February 19, 2025 13:48

bkettle reviewed Feb 20, 2025

aryx requested a review from gautambhat February 26, 2025 07:55

aryx added 2 commits March 6, 2025 18:01

Merge branch 'main' into tr_cache

b2d0ad8

ben and gautam's comments

057ff1b

bkettle approved these changes Mar 6, 2025

gautambhat approved these changes Mar 6, 2025

Merge branch 'main' into tr_cache

a900a8f

aryx merged commit 488ac75 into main Mar 7, 2025
3 checks passed

aryx deleted the tr_cache branch March 7, 2025 06:45
